System And Method For Seamless Switching Through Buffering

ABSTRACT

A method of preparing data streams to facilitate seamless switching between such streams by a switching device to produce an output data stream without any switching artifacts. Bi-directional switching between any plurality of data streams is supported. The data streams are divided into segments, wherein the segments include synchronized starting points and end points. The data rate is increased before an end point of a segment, to create switch gaps between the segments. Increasing the data rate can include increasing a bandwidth of the plurality of data streams, for example by multiplexing, or compressing the data. The present invention can be used, for example, with MPEG or AC-3 encoded audio and MPEG encoded video segments that are multiplexed into MPEG-2 transport streams. Also included are specific methods for preparing MPEG video streams and multiplexing MPEG video with MPEG or AC-3 audio streams to allow a receiver to create seamless transitions between individually encoded segments.

CROSS REFERENCE TO RELATED APPLICATIONS

This patent application is a continuation of Ser. No. 16/418,457, filed May 21, 2019, which is a continuation of U.S. patent application Ser. No. 15/393,454, filed Dec. 29, 2016, issued as U.S. Pat. No. 10,341,696 on Jul. 2, 2019, which is a continuation of U.S. patent application Ser. No. 14/065,132, filed Oct. 28, 2013, issued as U.S. Pat. No. 9,538,257 on Jan. 3, 2017, which is a continuation of U.S. patent application Ser. No. 12/911,502, filed Oct. 25, 2010, issued as U.S. Pat. No. 8,571,051 on Oct. 29, 2013, which is a continuation of U.S. patent application Ser. No. 12/106,825, filed Apr. 21, 2008, issued as U.S. Pat. No. 7,822,068 on Oct. 26, 2010, which is a continuation of U.S. patent application Ser. No. 10/369,047, filed on Feb. 19, 2003, issued as U.S. Pat. No. 7,382,796 on Jun. 3, 2008, which claims the benefit of U.S. Patent Application No. 60/357,804, filed on Feb. 15, 2002, which is a continuation-in-part of U.S. patent application Ser. No. 09/735,983, filed on Dec. 13, 2000, issued as U.S. Pat. No. 7,490,344 on Feb. 10, 2009, which claims the benefit of 60/236,624, filed on Sep. 29, 2000, all of which are incorporated by reference herein in their entirety.

FIELD OF THE INVENTION

This disclosure relates generally to a system and method for transmitting data and, more particularly, to a system and method of seamless switching between a plurality of data streams.

BACKGROUND

Typical television broadcasts do not allow personalization of television content to a viewer's profile. The standard television broadcast provides only one variant of every channel. The channel is selected by the viewer and the reception equipment (whether a television, a set-top-box, or any means of reception) selects the video and audio for that channel from the broadcast material. While this system allows the viewer to select their favorite channel or show from the available set, the individual viewer will be watching exactly the same content as everyone else that selects that channel. Due to the fact that channels are created to attract a wide range of viewers, viewers typically have different preferred channels. Disadvantageously, this is particularly evident when a program being broadcast on a channel is interrupted by a commercial advertisement that does not appeal to the interests of the viewer. The inevitable result is that the viewer will switch to another channel to avoid watching that particular commercial advertisement. It would be advantageous to personalize channels to be viewed by a viewer tailored to their particular interests and personal situations. For example, inclusion of personalized commercial advertisements will make such messages more relevant to the viewer, reducing the desire to change the channel being viewed.

A method of creating personalized messages is disclosed in co-pending U.S. patent application Ser. No. 09/841,465 filed on Apr. 24, 2001, which is incorporated herein by reference. One technique which can assist in the process of assembling personalized messages is the ability to switch rapidly between multiple data streams (such as audio and/or video feeds) that are received simultaneously, in order to assemble the message in real time, possibly as the message is being viewed by the end user.

However, switching between multiple data streams is problematic. One problem is that switching typically is not instantaneous. With present technology, it is difficult to switch from one high-bandwidth digital data stream to another without missing some data in at least one of the streams. This is true no matter what the type of data in the stream is, including audio, video, graphics, etc., or the type of switch, whether hardware (such as an integrated circuit), software or a combination of the two. Also, timing the switch point to minimize data interruption is very difficult. Switching between two streams typically results in artifacts due to loss of data or sometimes even introduction of erroneous data. For multimedia (such as television) signals, switching introduces very audible and visible artifacts in the sound and picture.

An example of this switching problem appears in MPEG based digital television. MPEG defines standards for digital television signals. The MPEG standards include the capability for compressing, coding and transmitting high-quality multi-channel, multimedia signals over a variety of broadband networks. MPEG encodes media signals as sequences of frames, and switching between separate sequences of frames multiplexed in, e.g., MPEG-2 transport streams takes a non-zero amount of time, and is usually partly executed in hardware and partly in software. In addition, a switch is actually allowed only at certain moments in time, due to dependencies between groups of data in MPEG (frame accurate switching is required). To illustrate this further, FIG. 1 shows switching and decoding components of an example digital television receiver 20. The transport stream 48 carrying multiple encoded data streams enters the demux (demultiplexer) 32. This demux 32 serves as the switch, by selecting which video and which audio data stream in the multiplexed transport stream 48 to pass on. These streams are then decoded respectively by a video decoder 42 and an audio decoder 44 (with buffers 38 and 37 for the encoded data between the demux 32 and the decoders 42 and 44). The results of the decoding are a stream of video frames 40 and audio samples 38, which can then be sent to display and audio equipment. The decoder is controlled by a receiver controller 46, which typically uses a microprocessor and software.

Normally, when switching between different video and audio streams within the Transport Stream 48, the receiver controller 46 first mutes/blanks the affected decoder (as shown by arrows 43 and 45), then switches the Demux 32 settings and then unmutes/unblanks the decoder(s). This will present a moment of silence/black to the viewer. It will never be a seamless switch for the viewer.

In an attempt to get a seamless switch, the mute/unmute sequence may be skipped. Now, however, the results depend on the exact moment of the switch with respect to the incoming data from the transport stream. Digital compression and transmission creates interdependencies between groups of video frames because of encoding and packaging and groups of audio samples because of packaging. Only at certain points within each data stream within the transport stream is it possible to switch out of the current stream without having visible and/or audible artifacts (safe exit point). Similarly, only at certain points within each data stream within the transport stream is it possible to switch into that stream without having visible and/or audible artifacts (safe entry point). The requirement of exactly hitting a combination of safe exit and safe entry point makes the seamless switch very difficult. In addition to this, the decoders 42, 44 are typically the only devices in the receiver 20 that can detect the right switching moments, while the demux 32 is the device that must be switched. Because of extensive data buffering between the demux and the decoders, detection by the decoder is of no use to determine the right moment to switch the demux.

One solution would be to build new receivers with specialized hardware and software (possibly including additional buffering at several locations in the receiver) to support seamless switching. However, this solution increases the cost and complexity of receivers, and cannot take advantage of the existing receivers on the market.

SUMMARY

In accordance with the present invention, there is provided a transmission system and method for seamlessly switching between a plurality of data streams to produce an output data stream with minimal or without any switching artifacts. Preferably, the seamless switch includes no visual or audible artifacts during reproduction of data. The disclosed transmission system and method for seamless switching may be utilized in applications including broadcasts where frame and sample accurate switching in a digital television environment is required. The system and method can facilitate multi-directional switching and does not require extensive modification to existing devices. Most desirably, the present disclosure finds application in personalized television.

The present invention includes a method of preparing a plurality of data streams to allow seamless switching between the data streams by a switching device that provides buffering of data. The method includes providing a plurality of data streams, where the data streams include data which is divided into segments that include synchronized starting points and end points on all of said plurality of data streams. The method includes providing gaps in the data streams between the end points and starting points, and increasing a data rate of the data streams at a time before an end point of a segment. This increase of the data rate can be performed by a number of techniques, alone or in combination, including (variations in) multiplexing and (variations in) compression. The present invention includes switching from one of the plurality of data streams to another one of the data streams at an end point of a segment. Gap trigger indicators can be inserted in the data streams proximate the end points, to indicate to a switch that a switch point is present or imminent. The switch exploits the presence of the gap to create the desired seamless switch.

An illustrative embodiment of the present invention is used to encode multimedia data streams using MPEG and/or AC-3 compliant encoders and multiplex the encoded streams into MPEG-2 transport streams. This allows a receiver, such as a digital set top box, to seamlessly switch between multiple channels and produce output with no switching artifacts.

U.S. Pat. No. 5,913,031 issued to Blanchard describes an encoder for system level buffer management. However, this patent describes maintaining a substantially constant data stream rate, with minor long-term adjustments to keep a post-switch frame buffer(s) full. This patent uses complex signal data rate analysis to maintain full frame buffers, and does not disclose a feature of adjusting a data rate to momentarily increase storage in the frame buffers. Further, this patent does not teach the creation of gaps in a data stream. Finally, this patent focuses on maintaining full buffers only through compression of data.

An advantage of the present invention is a transmission system and method of seamlessly switching between a plurality of data streams to produce an output data stream without any switching artifacts.

Another advantage of the present invention includes a system and method for seamless switching using presently deployed receiver systems such as digital set top boxes (STB). No extra buffering is required to be added to present receivers.

Another advantage of the present invention includes a system and method for seamless switching that facilitates multi-directional switching and does not require extensive modification to the existing devices.

Yet another advantage of the present invention includes a system and method for seamless switching that can be employed with personalized television applications.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, referred to herein and constituting a part hereof, illustrate the exemplary embodiments of the system and method for seamless switching of the present invention and, together with the description, serve to explain the principles of the invention.

FIG. 1 is a block diagram of an illustrative embodiment showing a digital switch for a digital television receiver;

FIG. 2 shows an example of a time-multiplexed data stream;

FIG. 3 is a block diagram showing a switch receiving multiple digital data streams;

FIG. 4 illustrates a switch receiving multiple digital data streams, where each data stream is split into separate segments;

FIG. 5 illustrates a switch receiving multiple segmented digital data streams, where the start and end times of the segments are synchronized;

FIG. 6 illustrates a switch receiving multiple synchronized segmented digital data streams, including a separate control stream providing switch trigger messages;

FIG. 7 illustrates a switch receiving multiple synchronized segmented digital data streams, including a separate control stream providing switch trigger messages, where the segments are separated by means of a ‘switch gap’;

FIG. 8 shows a transport stream with a personalized message according to an illustrative embodiment;

FIG. 9 shows a video buffer occupancy in a normal situation;

FIG. 10 shows a video buffer occupancy with overflows resulting from increasing the transmission rate of video;

FIG. 11 shows a video buffer occupancy in a normal situation with the video encoder assuming a smaller video decoder buffer size;

FIG. 12 shows a video buffer occupancy without overflows resulting from increasing the transmission rate of video, encoded with a lower buffer size;

FIG. 13 shows an audio buffer occupancy in a normal situation;

FIG. 14 shows an audio buffer occupancy with overflows resulting from increasing the transmission rate of audio; and

FIG. 15 shows an audio buffer occupancy without overflows resulting from increasing the transmission rate of audio and multiplexing with a lower target buffer fill level.

DETAILED DESCRIPTION

The present invention finds utility in various data transmission applications including, but not limited to, transmission, reception and decoding of digital broadcast television (whether distributed via cable, terrestrial, satellite, microwave, etc.); assembly and preparation of television-on-demand (such as video-on-demand); encoding, multiplexing, and decoding of MPEG and AC-3 based multimedia streams; creation and playback of Digital Versatile Disk (DVD); transmission and reception of data streams over cellular and internet networks, etc.

Generally, switching between two or more data streams that are received simultaneously takes a non-zero amount of time, during which data from one or both of the streams is lost. In a digital television receiver for example, the input streams for audio and video are time multiplexed with other information into a transport stream. This time multiplexing makes it necessary to send the audio and video data in bursts and ahead of the presentation time. The data is buffered in the receiver 20 (FIG. 1) and played out at a predetermined moment relative to a presentation clock. Because of the way that digital television receivers are constructed, the switching mechanism that allows the selection of video and audio streams from the transport stream is located before the playback buffering. And because the data is transmitted in burst-mode, there is no way to know the fill level of the buffer without detailed knowledge of the incoming transport stream, and the current playback time of the receiver. There is also no way to know where in time the other video (or audio) stream transmissions are in relation to the current video (or audio) stream, since such streams are typically not synchronized to each other.

A further illustration of time-multiplexing different data streams is depicted in FIG. 2. This figure shows the structure of the actual single multiplexed stream 43 that contains data from a number of different data streams 45, multiplexed together. As can be seen, in this situation, only one data element 53 from one stream enters the switch at any moment in time.

Note that the present invention is not limited to situations of time-multiplexed data. However, time-multiplexed data is a particularly attractive situation for which this invention is applicable.

Also, the way MPEG video is encoded makes it necessary to switch at the start of a video sequence, because otherwise the receiver has to wait for the start of the next sequence (for example, wait for the next I-Frame, which can easily take a few hundreds of milliseconds). A similar problem exists for audio, where if packets are missed, the audio sequence may be able to recover, but not without causing very audible switching artifacts.

All this makes the exact moment of switching over very critical, while there is no information embedded in the transport stream to find out what is being received in relation to the presentation time. The latency of the (software and hardware in the) receiver processing system is also too large to react to what is being received without knowing ahead of time what will be coming. The conclusion is that, without proper preparation of the transport stream to give the receiver time to react, the results of a switch between streams will be non-deterministic.

Consider the general situation depicted in FIG. 3, in which a plurality of digital datastreams 45 are received simultaneously by a switch 47. The datastreams 45 can originate from any type of source (e.g., a digital television broadcast, a storage system such as a hard disk, a DVD disk, a computer network, etc.). The data streams 45 can have varying datarates (the amount of data they contain over a period of time). The datarate per stream can even vary over time (example: Variable BitRate, or VBR, encoded MPEG-2 video). Sometimes a stream might not contain data at all for a certain period of time. The plurality of data streams can be provided to switch 47 in a variety of ways, for example time-multiplexed together (as in MPEG-2 digital television signals, usually together with other content data streams not here depicted). Another option is frequency multiplexing. Yet another option might be provision via separate physical channels.

The switch 47 can be programmed (for example, by external control software) to receive data from one of the incoming datastreams, and place it in a buffer 49 which has a limited capacity. It is important to note that the switch 47 can only receive data from one of the incoming data streams 45 at any one time. The switch 47 can be implemented in a variety of ways: entirely in software, entirely in hardware, or a combination of the two.

Consumer 51 subsequently takes varying amounts of data from buffer 49 at defined moments in time (e.g., each time a video frame must be decoded, it takes the data for the next frame from the buffer). The plurality of datastreams 45 is generated such that continuous selection of the data from any stream by the switch (after a certain small amount of start-up and initialization time) leads to neither under nor overflow of the buffer 49 given the known behavior of consumer 51.

An example of the general situation depicted in FIG. 2 is the digital television receiver as depicted in FIG. 1. An MPEG-2 transport stream (TS) 48 enters the demultiplexer 32. The MPEG-2 TS 48 contains a variety of different digital datastreams 45, together time-multiplexed in this one single digital transport stream. The demultiplexer 32 essentially is a set of separate switches, controlled from software. In typical modern-day digital television receivers there are two dedicated switches, one for audio and one for video, capable of selecting one video stream 33 and one audio stream 35 from the potentially many audio and video streams contained in the transport stream 48. These streams are forwarded to the dedicated audio decoder 44 and video decoder 42 via audio decoder buffer 37 and video decoder buffer 38. The audio and video decoders produce decoded audio and video ready for presentation, for example, on the screen/speakers. The decoders remove data from the buffers 37 and 39 at defined moments in time.

Other switches in the demux 32 are responsible for selecting other, typically low-bandwidth, datastreams, and forwarding their contents 41 to the control software 46.

In MPEG video, the behavior of the video buffer and decoder is modeled using the VBV model, which specifies how individual video streams, before multiplexing, must be constructed to avoid buffer underflows or overflows in the video decoder buffer. Furthermore, the MPEG standards model the behavior of demultiplexer and decoders together in the T-STD model, which describes how individual data streams must be multiplexed into an MPEG-2 transport stream to avoid decoder underflows or overflows.

Existing digital television encoders and multiplexers are built to ensure that single data streams, when played, will provide continuous, smooth presentation. Once the switch has selected a channel/datastream to play from (and after start-up), playback from that channel will be smooth until a switch is made.

Now consider the situation depicted in FIG. 4, where the data streams as introduced in FIG. 3 are split up into segments. Shown in FIG. 4 is a group of ‘From’ segments {F(0), F(1), F(2)} and a group of ‘To’ segments {T(0), T(1), T(3)}. These segments are sequences of data constructed such that Consumer 51 (not shown) can provide seamless presentation of any of the ‘From’ segments followed by any one of the ‘To’ segments. In the case that the segments are MPEG video, for example, it is required that each such segment starts with an I frame, and ends with a closed GOP (Group of Pictures), meaning that there is no dependence on data coming after it in the same data stream. Furthermore, all segments have to be multiplexed relative to the same clock (for example, the so-called PCR in MPEG-2 transport streams). Also, all last frames (in presentation order) in the set of ‘from’ segments must have an identical presentation time. Finally, all first frames (again, in presentation order) in the ‘to’ segments must have an identical presentation time, which is one frame time later than the presentation time of the last frame in the ‘from’ segments. Together, if all these requirements are satisfied, it means that, independent of what stream is being decoded, a switch is in principle possible.
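The segment requirements listed above can be expressed as a simple validity check. The following is only a minimal sketch: the `Segment` record, its field names, and the function `is_switch_point` are hypothetical and introduced for illustration, and presentation times are assumed to be already expressed in seconds relative to the same clock.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Segment:
    # Hypothetical summary of one encoded segment; field names are illustrative.
    starts_with_i_frame: bool
    ends_with_closed_gop: bool
    first_pts: Fraction  # presentation time of the first frame, in seconds
    last_pts: Fraction   # presentation time of the last frame, in seconds


def is_switch_point(from_segments, to_segments, frame_time):
    """Return True if any 'From' segment may be followed by any 'To' segment."""
    if not from_segments or not to_segments:
        return False
    # Every segment must start with an I frame and end with a closed GOP.
    for seg in list(from_segments) + list(to_segments):
        if not (seg.starts_with_i_frame and seg.ends_with_closed_gop):
            return False
    # All last frames of the 'From' set must share one presentation time.
    last_times = {seg.last_pts for seg in from_segments}
    # All first frames of the 'To' set must share one presentation time.
    first_times = {seg.first_pts for seg in to_segments}
    if len(last_times) != 1 or len(first_times) != 1:
        return False
    # The 'To' set starts exactly one frame time after the 'From' set ends.
    return first_times.pop() - last_times.pop() == frame_time
```

Exact rational arithmetic (`Fraction`) is used here so that a frame time such as 1/25 second compares exactly; a real multiplexer would more likely compare integer clock ticks.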

The pair of ‘From’ segments and ‘To’ segments essentially defines a switch point, where a transition between data streams is possible without interruption. One particularly interesting utility of such switch points is the ability to create different storylines for different end-viewers by choosing different sequences of segments. Switch points are the locations where a ‘safe’ transition (or cross-over) between data streams can be made, both from a technical and creative viewpoint.

Playback by the consumer 51 (not shown) can obviously only be seamless under the provision that during the switch no data is lost and no extra data is introduced, and that the buffer 49 (not shown) is neither underflowing (insufficient data is present in the buffer when needed by the consumer, so the presentation has to be interrupted) nor overflowing (the buffer is full, so switch 47 would have to discard data).
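The underflow and overflow conditions can be illustrated with a small event-driven occupancy check. This is a simplified sketch and not the VBV or T-STD model: data is assumed to arrive and be removed in instantaneous chunks, and the function name `check_buffer` and its arguments are hypothetical.

```python
def check_buffer(arrivals, removals, capacity):
    """Classify a schedule as 'ok', 'underflow' or 'overflow'.

    arrivals: list of (time, num_bytes) delivered into the buffer by the switch.
    removals: list of (time, num_bytes) taken out of the buffer by the consumer.
    capacity: buffer size in bytes.
    """
    # Tag arrivals with 0 and removals with 1 so that, at equal times,
    # data arriving is counted before the consumer removes data.
    events = [(t, 0, n) for t, n in arrivals]
    events += [(t, 1, -n) for t, n in removals]
    occupancy = 0
    for _, _, delta in sorted(events):
        occupancy += delta
        if occupancy < 0:
            return "underflow"  # consumer needed data that was not there
        if occupancy > capacity:
            return "overflow"   # the switch would have to discard data
    return "ok"
```

For example, `check_buffer([(0, 100), (1, 100)], [(1, 100), (2, 100)], 200)` classifies the schedule as acceptable, while the same schedule with a capacity of 150 bytes overflows when the second chunk arrives.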

However, during the transmission and multiplexing of different datastreams, generally no attention is paid to the synchronization of data between the different streams, since these are normally independent from each other. In normal broadcast television, the viewer is intended to watch one data stream, and when changing channels, a hiccup in presentation is allowed since the new program is a completely different program anyway. It is clear, however, that preparing the data streams independently from each other, as is done today, will not allow for a seamless transition. The present invention discloses a series of methods to prepare the plurality of data streams so as to obtain seamless switches between segments in the data streams.

FIG. 4 shows the situation where the data from ‘From’ segments F(0), F(1) and F(2) is currently entering the switch. The last data elements of each of these segments potentially enter the switch at slightly different moments in time e(0), e(1), and e(2) respectively, e.g., because of time multiplexing of the data. A set of new segments T(0), T(1), T(3) is coming up. The first data elements of each of the ‘To’ segments enter the switch at times s(0), s(1), and s(3) respectively. The intention is that any switch from any of the ‘From’ segments can be made to any of the ‘To’ segments. However, using conventional multiplexing techniques it might well be that some of the ‘To’ segments start slightly (e.g., a few transport stream packets in MPEG-2) before some of the ‘From’ segments have ended, since these are multiplexed independently from each other. Normally there would be no reason to have any synchronization between the segments in different, fully independent, data streams.

However, to be at all able to make a seamless switch from any of the ‘from’ segments to any of the ‘to’ segments, and under the assumption that the switch cannot receive/buffer data from more than one data stream, the first data belonging to any of the ‘to’ segments cannot arrive at the switch earlier in time than the last data belonging to any of the ‘from’ segments.

FIG. 5 shows the desired situation where the ‘From’ segments all end before any of the ‘To’ segments start. The multiplexer that produced this particular stream has now taken into account that the start and end positions of segments across data streams must be synchronized.
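The synchronization condition amounts to comparing the latest end time e(i) of the ‘From’ segments with the earliest start time s(j) of the ‘To’ segments. A minimal sketch follows; the function name and the numeric arrival times are assumptions made for illustration.

```python
def segments_synchronized(from_end_times, to_start_times):
    """True if no 'To' segment starts arriving before every 'From' segment has ended.

    from_end_times: arrival times e(i) of the last data of each 'From' segment.
    to_start_times: arrival times s(j) of the first data of each 'To' segment.
    """
    return min(to_start_times) >= max(from_end_times)


# Hypothetical arrival times in seconds.
independent_mux = segments_synchronized([10.00, 10.02, 10.01], [10.01, 10.03, 10.04])
synchronized_mux = segments_synchronized([10.00, 10.02, 10.01], [10.03, 10.03, 10.04])
```

In the first call one ‘To’ segment starts at 10.01 while a ‘From’ segment is still arriving until 10.02, which corresponds to the independently multiplexed case of FIG. 4; the second call corresponds to the synchronized case of FIG. 5.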

Another important requirement for a seamless switch is ensuring that no erroneous data flows into the buffers and/or decoders. For example, suppose that it is desirable in the situation of FIG. 5 to play segment T(1) after F(0). Now suppose that switch 47 would switch directly after the end of segment F(0) to the data stream containing segment T(1). If the switch were made too fast, this could mean that some of the last data of segment F(1) would flow into buffer 49 (not depicted), since segment F(1) ends after segment F(0). This obviously is undesirable. One way to solve this problem is distributing all segments over their own (private) data streams, but this is wasteful in the amount of data streams needed. A better option is using a dedicated ‘trigger’ message that tells the receiver when it is safe to make the switch. It is important to note that the switch should also not be made too late, since then data from the next segment might be lost. Therefore the trigger message must preferably enter the switch directly after the last data element of the latest ending ‘From’ segment (in the example, F(1)) has been consumed by the switch. Such a trigger can be transmitted on a separate channel which is time synchronized with the other channels, or it can be contained in each of the data streams itself at the appropriate time (in the form of a data packet without actual data to be used by the consumer, e.g., MPEG user data or an MPEG splice point). The situation with a trigger message on a separate channel is depicted in FIG. 6. The message appears on the separate control channel directly after the last data of the latest ending ‘From’ segment, so that no time is lost.
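The placement of the trigger message can be sketched as follows: the trigger goes directly after the latest ending ‘From’ segment, and the switch must have completed before the earliest ‘To’ segment starts. The function names, the dictionary layout, and the times are hypothetical and serve only to illustrate the reasoning above.

```python
def trigger_window(from_end_times, to_start_times):
    """Return (earliest_safe_time, latest_safe_time) for making the switch.

    from_end_times: mapping of 'From' segment name to arrival time of its last data.
    to_start_times: mapping of 'To' segment name to arrival time of its first data.
    """
    latest_end = max(from_end_times.values())
    earliest_start = min(to_start_times.values())
    if earliest_start < latest_end:
        raise ValueError("segments are not synchronized; no safe switch time exists")
    # The trigger message is placed at latest_end; the switch must be
    # completed no later than earliest_start.
    return latest_end, earliest_start


def is_safe_switch_time(t, from_end_times, to_start_times):
    """True if switching at time t neither admits stale data nor loses new data."""
    latest_end, earliest_start = trigger_window(from_end_times, to_start_times)
    return latest_end <= t <= earliest_start


# Hypothetical times in seconds; F(1) ends last, as in the example above.
ends = {"F(0)": 10.0, "F(1)": 10.2, "F(2)": 10.1}
starts = {"T(0)": 10.5, "T(1)": 10.5, "T(3)": 10.6}
```

With these values, switching at 10.05 (directly after F(0) ends) is rejected because data of F(1) is still arriving, whereas any time between 10.2 and 10.5 is accepted.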

In general, a certain amount of time goes by between detection of the just described trigger message and the actual switching of the switch to the new data stream. Even when the trigger message is received in time, the receiver software needs a certain non-zero amount of time to react to receipt of the trigger message by the switch (usually via an interrupt routine), and instruct the switch to switch (usually via device drivers).

The present invention also includes the introduction of so-called ‘switch gaps’ 55, as shown in FIG. 7, between synchronized segments in the data streams between which a switch can be made. A switch gap 55 typically is a period of ‘no data’ (or silence) on those data streams.

Switch gaps 55 can be introduced by exploiting the presence of buffer 49 located after the switch. By temporarily filling this buffer more than normal, the ‘From’ segments can all end earlier than normal, thus creating the switch gaps. All data in the segments will still be transmitted; it will just arrive at the receiver earlier, and it will sit in the buffer longer. Essentially this means that the data for each ‘From’ segment F will be transmitted earlier than normal, thus creating the gap.

Note that an alternative method for creating gaps is delaying the start of the ‘To’ segments. However, this usually leads to buffer underflows in buffer 49. The only real way to avoid such underflows is starting the ‘To’ segments at the usual time and ending the ‘From’ segments earlier than usual by transmitting more data than usual. Although transmitting more data on the data streams will lead to a higher bandwidth, this is in general no problem since transmission channels such as MPEG-2 transport streams have spare bandwidth available (e.g., in the form of NULL packets) to cover bursts of data.

The required size of the gap is the maximum time needed for the receiver to switch, and depends on the hardware and software of the receiver. In the case of STB (set top box) receivers, the gap timing may vary based on the brand of STB. The gap size should typically be chosen to accommodate the slowest switching time of a set of different STBs deployed in a particular distribution network (such as a cable system). Experiments have shown that a typical digital television receiver will require a switch gap of around 30 msec., with 50 msec. being a realistic maximum.
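Choosing the gap for a network then reduces to taking the slowest switching time among the deployed receivers. The sketch below assumes a hypothetical table of measured switching times per STB model; the model names are invented, and the values merely echo the 30 msec. typical and 50 msec. maximum figures mentioned above.

```python
def required_switch_gap_ms(switch_times_ms):
    """Return the gap (in msec) that accommodates the slowest deployed receiver."""
    if not switch_times_ms:
        raise ValueError("at least one receiver switching time is required")
    return max(switch_times_ms.values())


# Hypothetical measurements for receivers deployed in one cable system.
deployed = {"stb_model_a": 30, "stb_model_b": 28, "stb_model_c": 50}
gap_ms = required_switch_gap_ms(deployed)
```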

Various options for increasing the bandwidth to create the switch gaps exist. One issue that arises when creating gaps in MPEG-2 transport streams is that it is not legal to burst too much data in a short time into either the audio or video buffers. There are clear rules stated in the MPEG specification (more specifically, the section on the T-STD decoder model), which govern how fast video and audio data can be sent. These rules concern a small 512-byte receiving buffer known as the transport buffer. Assuming a 27-megabit transport stream, for instance, video packets have to occupy on average not more than two out of every three packets in the stream. Audio can only occupy on average about one out of every 14 packets. Thus it is not sufficient to simply move gap data close together slightly before the gap. The transmission pattern of the moved data must obey the transport buffer rate rules.
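The packet ratios quoted above follow from dividing the rate at which the transport buffer drains by the transport stream rate. The sketch below reproduces them under assumptions that are not stated in the text: a drain rate of 18 Mbit/s for video (1.2 times a 15 Mbit/s maximum video rate) and 2 Mbit/s for audio. These values are believed to correspond to the T-STD model for Main Profile at Main Level video, but they should be checked against the MPEG-2 Systems specification before being relied upon.

```python
from fractions import Fraction

TRANSPORT_RATE_BPS = 27_000_000   # from the example in the text
VIDEO_DRAIN_RATE_BPS = 18_000_000  # assumption: 1.2 x 15 Mbit/s maximum video rate
AUDIO_DRAIN_RATE_BPS = 2_000_000   # assumption: audio transport buffer drain rate


def max_packet_share(drain_rate_bps, transport_rate_bps):
    """Largest average fraction of transport packets one stream may occupy."""
    return Fraction(drain_rate_bps, transport_rate_bps)


video_share = max_packet_share(VIDEO_DRAIN_RATE_BPS, TRANSPORT_RATE_BPS)
audio_share = max_packet_share(AUDIO_DRAIN_RATE_BPS, TRANSPORT_RATE_BPS)
# One audio packet out of every 1 / audio_share packets on average.
packets_per_audio_packet = float(1 / audio_share)
```

Under these assumed rates the video share is two out of three packets and the audio share is one out of 13.5 packets, in line with the "about one out of every 14" figure.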

Given this consideration, one particularly attractive option for creating the switch gap is spreading out the increase in data over the entire duration of a segment (i.e., slightly increasing the datarate of the entire segment), since this strategy evenly distributes the gap data over the segment, thus eliminating the risk of transport buffer overflows. For instance, suppose we have a 3-second video segment encoded at 4000000 bits per second, but it is desired to create a 30-millisecond gap before the next segment starts. This 3-second segment will, in the normal case, also take 3 seconds to transmit at the bitrate of 4000000 bits per second. Now, by transmitting the segment at a bitrate of 4040000 bits per second instead of 4000000 bits per second, the 30 msec. gap (4000000*0.03/8=approx. 15000 bytes of data) is automatically created at the end of transmission of this segment. This form of gap creation has the mentioned benefit of distributing the buildup of the extra gap data over the entire length of the segment, and so avoiding the creation of a bandwidth bottleneck just before the gap. Other options for creating the gap are variations of this general scheme, such as only increasing the bitrate from a certain point in the segment. This strategy is attractive in situations of live/online encoding/multiplexing, where it is initially not known when a segment will end.
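The arithmetic of this example can be checked with a short calculation. The sketch assumes a constant-bitrate segment and a constant transmission rate over the whole segment; the function names are hypothetical.

```python
def transmission_rate_for_gap(encoded_rate_bps, duration_s, gap_s):
    """Rate at which a segment must be sent so that it finishes gap_s early."""
    if not 0 <= gap_s < duration_s:
        raise ValueError("gap must be non-negative and shorter than the segment")
    total_bits = encoded_rate_bps * duration_s
    return total_bits / (duration_s - gap_s)


def gap_for_transmission_rate(encoded_rate_bps, duration_s, transmission_rate_bps):
    """Gap created when a segment is sent at transmission_rate_bps."""
    total_bits = encoded_rate_bps * duration_s
    return duration_s - total_bits / transmission_rate_bps


def early_data_bytes(encoded_rate_bps, gap_s):
    """Amount of data that sits in the buffer longer than normal because of the gap."""
    return encoded_rate_bps * gap_s / 8


exact_rate = transmission_rate_for_gap(4_000_000, 3, 0.03)
rounded_gap = gap_for_transmission_rate(4_000_000, 3, 4_040_000)
extra_bytes = early_data_bytes(4_000_000, 0.03)
```

A gap of exactly 30 msec. requires about 4040404 bits per second; the rounded figure of 4040000 bits per second yields a gap of about 29.7 msec., and roughly 15000 bytes are held in the buffer ahead of their normal arrival time.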

The person skilled in the art can see that existing MPEG-2 multiplexers can easily be extended to multiplex individual data streams according to the methods just described (i.e., synchronizing the start and end of individual segments with a switch gap in between them, multiplexing data at slightly higher data rates than the data actually has, and inserting trigger messages at the right time).

Furthermore, those skilled in the art can easily see that this model can be generalized, for example to a situation with multiple switches (such as a digital set-top box with both audio and video switching). The number of parallel streams at the switch point does not matter; thus the present invention can be scaled up to any number of streams and provides a method to create a multi-directional seamless switch point.

Although described mainly in terms of MPEG-2 transport streams, the present invention will work with any type of multiplexed data streams, such as SMPTE360-M, MPEG-1 Systems, MPEG-2 Program streams, MPEG-4 systems, in any situation wherein some control is provided over the data stream rates and a buffer is provided after the switch and before the actual users (decoders) of the data. By utilizing the independence between reception and presentation, the present invention introduces a discontinuity in the transmission of all streams and thus creates an opportunity to switch without loss of data or introduction of unwanted data.

The creation of switch gaps by sending data early has consequences for the encoding of the segments in the data streams when these segments are MPEG video (whether MPEG 1, MPEG 2, MPEG 4, or any variation thereof). The creation of gaps must still result in a stream that is compliant with the MPEG buffer models, such as the VBV and T-STD models. This section discloses an illustrative embodiment of an encoding method that allows for such compliance in the presence of switch gaps.

When the segments in the data streams are MPEG-encoded video, specific precautions must be taken to not overflow the decoder buffer located before the actual video decoder. The MPEG video buffer in MPEG decoders/receivers has a fixed size, which can easily lead to overflows of this buffer when sending video data early. Such an overflow leads to discarding of data and consequently undesired playback artifacts.

A typical example of a video buffer size is 224 Kbyte as used for decoding MPEG-2 MP@ML video (used in virtually all consumer digital set-top boxes). For different profiles/levels/versions of MPEG, different buffer sizes exist, but the basic principle is the same.

While encoding a video segment, the video encoder takes the maximum buffer size into account, and ensures that it never produces output that can overflow that buffer. However, this assumes that the data is entering and leaving the buffer at the normal (encoding) rate, which is lower than the higher data rate that the invention uses to create switch gaps.

FIG. 9 shows an example of the normal buffer fullness over time of an MPEG video decoder buffer (‘normal’ meaning that video data is not transmitted early). The video encoder encodes the video such that, at the normal transmission speed (bitrate), the buffer will not overflow. The transmission speed (buffer fill rate) is visible in the angle of the up-slopes, such as 105 a, in the graph. Picture data is taken out of the buffer at defined moments in time 102, and subsequently decoded (for example, for interlaced PAL and NTSC video, frames are typically taken out of the buffer 25 and 29.97 times per second, respectively).

MPEG video encoders will guarantee that the buffer level will always stay below the defined maximum 101, assuming a normal transmission rate. As can be seen, the buffer fullness varies considerably over time, depending on the size of individual video frames such as the I and P frames 106 a. Frame sizes are allocated by a video encoder as part of its rate control algorithm. Typically, so-called ‘I’ (Intra) frames are much bigger than ‘B’ or ‘P’ frames.

Following the present invention, the data rate of video segments will be increased to create a switch gap 108 as shown in FIG. 10. The time of the last data of the segment entering the buffer has been moved from time 109 a in FIG. 9 to an earlier time 109 b in FIG. 10.

The buffer fill rate 105 a of the video decoder buffer in FIG. 9 will therefore increase to a fill rate 105 b before time 109 b, as shown in FIG. 10. As can be seen in the Figure, this leads to buffer overflows since the data is loaded into the buffer earlier than normal. After the gap is finished at time 109 a, it can be seen that the buffer level in FIG. 10 is back at the same level as it was in FIG. 9 at time 109 a. From that time 109 a, new data (e.g., for a new segment) starts entering the buffer, most likely again at a higher data rate to create a gap at the end of that new segment, facilitating another seamless switch.

A simple calculation can illustrate how much the buffer can potentially overflow given the amount of data needed to bridge the switch gap. Assume, for example, that the video is encoded at a bitrate of 4000 Kilobit/sec. Furthermore, assume that the desired switch gap is 30 msec. In this case the amount of data that has to be moved earlier in time is 0.03*4000/8=15 Kbyte of data. Consequently, the video decoder buffer can overflow by as much as 15 Kbyte.
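The effect can be illustrated with a small simulation of decoder buffer occupancy. This is only a sketch: the frame sizes, the 0.2-second startup delay, the 25 frames-per-second rate, and all names are assumptions chosen for illustration, and underflow is not checked.

```python
def peak_occupancy(frame_bytes, fill_bps, frame_interval, startup_delay):
    """Peak decoder-buffer occupancy in bytes.

    Occupancy only rises between frame removals, so it is sampled
    just before each frame is taken out of the buffer.
    """
    total = sum(frame_bytes)
    removed = 0
    peak = 0.0
    for k, size in enumerate(frame_bytes):
        t = startup_delay + k * frame_interval
        delivered = min(total, fill_bps / 8 * t)  # nothing arrives after the segment ends
        peak = max(peak, delivered - removed)
        removed += size
    return peak


ENCODED_BPS = 4_000_000
GAP_SECONDS = 0.030

# Assumed frame pattern: one large frame followed by 14 smaller ones, five times.
frames = ([62_000] + [17_000] * 14) * 5
segment_seconds = sum(frames) * 8 / ENCODED_BPS  # 3 seconds at the encoded rate
fast_bps = ENCODED_BPS * segment_seconds / (segment_seconds - GAP_SECONDS)

peak_normal = peak_occupancy(frames, ENCODED_BPS, 1 / 25, 0.2)
peak_early = peak_occupancy(frames, fast_bps, 1 / 25, 0.2)
extra = peak_early - peak_normal
bound = ENCODED_BPS * GAP_SECONDS / 8  # the 15 Kbyte figure from the text

print(f"peak rises by {extra:.0f} bytes (bound: {bound:.0f} bytes)")
```

In this sketch the peak occupancy rises when the segment is sent early, and the rise never exceeds the amount of data moved ahead of the gap.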

The fundamental reason that the video decoder buffer can overflow in this situation, besides the working assumption of a (normal) transmission rate equal to the bitrate of the video, is the assumption made by MPEG video encoders that they can make full use of the entire video decoder buffer (e.g., 224 Kbyte for MP@ML video) when making decisions on the sizes of the individual frames.

The present invention includes modifying existing encoders (or instructing existing encoders, in case they have such settings) to assume a (usually slightly) smaller video decoder buffer than actually available. This would lead to different encoding decisions (assigning different amounts of bytes to different frames) to keep the buffer occupancy guaranteed below the new (lower) limit.

This technique is illustrated in FIG. 11. This Figure shows the same video as in FIG. 9, however now encoded with a reduced maximum buffer size 103 (e.g., 15 Kbyte less for 4 Mbps video and a desired gap of 30 msec.). As can be seen, some of the frames 106 b have a different size compared to the same frames 106 a in FIG. 9, which is a result of decisions by the encoder to ensure that the buffer occupancy always stays below the defined (lowered) maximum size. FIG. 12 shows the video from FIG. 11 after the switch gap 108 has been created by transmitting the video data earlier. As can be seen, the maximum buffer level now exceeds the maximum buffer level 103 as instructed to the video encoder, but it stays below the actual real maximum buffer size 101. Thus, no overflows of the actual video decoder buffer will occur.
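A minimal sketch of how the reduced limit 103 could be derived from the real limit 101 is given below. The function name is hypothetical, and 224 Kbyte is assumed to mean 224*1024 bytes.

```python
def encoder_buffer_limit(actual_buffer_bytes, video_bps, gap_seconds):
    """Buffer size to give the encoder so that sending data early
    to create the gap cannot overflow the real decoder buffer."""
    early_bytes = video_bps * gap_seconds / 8
    limit = actual_buffer_bytes - early_bytes
    if limit <= 0:
        raise ValueError("gap is too large for this buffer")
    return limit


ACTUAL_BUFFER_BYTES = 224 * 1024  # assumption: 1 Kbyte = 1024 bytes
limit = encoder_buffer_limit(ACTUAL_BUFFER_BYTES, 4_000_000, 0.030)

# Worst case after gap creation: encoder peak plus all early data at once.
worst_case_peak = limit + 4_000_000 * 0.030 / 8

print(f"instruct the encoder to assume {limit:.0f} bytes")
```

Because the worst case is the encoder's own limit plus the early data, it cannot exceed the real buffer size, which is the situation depicted in FIG. 12.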

Another situation where encoding video with a slightly lower than real target decoder buffer size is useful is to ensure that two separately encoded video segments can be correctly played in sequence without running the risk of temporary buffer overflow. A state-of-the-art video encoder allocates bytes to individual video frames such that, given the bitrate with which the video is encoded (the angle of the slopes 105 in the buffer graphs), the buffer does not overflow. However, when encoding individual segments, at the end of each such segment the encoder may assume that no more data is flowing into the buffer (only the remaining data is taken out of it). In such a case, the buffer might be very full until the very last frame is taken out of it (e.g., a large I frame). If now another video segment starts entering the buffer (e.g., after a created gap), a potential for buffer overflow exists at the beginning of this new data entering the buffer.

One technique to solve this problem is always encoding video with a lower buffer size, which enables streaming in of next segments to be played with a reduced risk of buffer overflow. Another option is adaptive encoding of segments with increasingly smaller buffer sizes, until overflows are avoided. Yet another option is artificially placing extra (dummy) frames after the end of the segment, encoding the segment, and then removing the added frames. This tricks the encoder into thinking that more data will follow after the real last frame, and therefore the encoder will not assume that no data will follow and consequently will not allow the buffer to fill up.
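The adaptive option can be sketched as a simple loop. The `encode` and `overflows` callables stand in for a real encoder and a real buffer check; they, the step size, and the stand-in model used in the demonstration are hypothetical and only illustrate the control flow.

```python
def encode_with_shrinking_buffer(encode, overflows, start_bytes, step_bytes, min_bytes):
    """Re-encode with smaller and smaller assumed buffer sizes until
    the result no longer overflows the real decoder buffer."""
    size = start_bytes
    while size >= min_bytes:
        stream = encode(size)        # hypothetical encoder call
        if not overflows(stream):    # hypothetical buffer-model check
            return size, stream
        size -= step_bytes
    raise RuntimeError("no buffer size in the allowed range avoids overflow")


# Stand-in model: the encoder fills whatever buffer it is told about,
# and 15000 bytes of data arrive early because of the switch gap.
ACTUAL_BUFFER_BYTES = 224 * 1024
EARLY_BYTES = 15_000


def fake_encode(assumed_buffer_bytes):
    return {"peak_bytes": assumed_buffer_bytes}


def fake_overflows(stream):
    return stream["peak_bytes"] + EARLY_BYTES > ACTUAL_BUFFER_BYTES


chosen, _ = encode_with_shrinking_buffer(
    fake_encode, fake_overflows, ACTUAL_BUFFER_BYTES, 4096, 4096
)
print(f"first overflow-free buffer size: {chosen} bytes")
```

A coarse step keeps the number of encoding passes small at the cost of giving up slightly more buffer than strictly necessary.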

For reasons similar to those for video, certain precautions also have to be taken for audio to not overflow or underflow the audio decoder's buffer. However, the situation is not as difficult as with video, since audio frames usually have very similar sizes, and audio can arrive in a decoder just in time, since there are no time dependencies between different audio frames. No specific precautions have to be taken during audio encoding. Instead, a slightly different multiplexing scheme might have to be used.

For example, for MPEG-2 audio a typical buffer size is 3584 bytes while for AC-3 audio it is 2336 bytes. Assuming an audio bitrate of, e.g., 128 Kbps and a required switch gap of 30 msec., this means that at least 492 extra bytes need to be loaded into the audio decoder buffer prior to the switch gap.

Normally, when multiplexing audio, a multiplexer has to decide what the target maximum buffer fullness level of the audio decoder buffer is (i.e., how many frames to keep in this buffer on average). For example, FIG. 13 shows the audio buffer occupancy over time where the multiplexer has decided to keep the audio buffer rather full at a max level 142 a. Audio frames are taken out of the buffer at defined moments 144. Audio is transmitted at its normal encoding bitrate, indicated by the slopes 145 a.

Creation of a switch gap will lead to buffer overflows as shown in FIG. 14. Audio is transmitted at a slightly higher than normal bitrate 145, leading to the indicated overflow just before the switch gap 148. The problem is that the target buffer fullness level 142 a is too high. Note that the buffer occupancy is back at this target level after the switch gap has been passed.

An illustrative embodiment that addresses this problem is instructing the multiplexer (or slightly modifying an existing multiplexer) to keep the target audio buffer occupancy lower than normal, e.g., using a strategy as depicted in FIG. 15. By choosing a low normal buffer fullness level 142 b (reached just before a frame is taken out of the buffer), the multiplexer now has space to insert an audio segment with a slightly higher bitrate without exceeding buffer limits. The maximum allowed target buffer fullness can easily be computed from the normal buffer size, the audio bitrate, and the desired switch gap. Given the above example of 128 Kbps encoded audio, a desired switch gap of 30 msec., and a normal audio AC-3 decoder buffer of 2336 bytes, the target buffer full level can be at most 2336−492=1844 bytes.
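The computation can be sketched as follows. The function names are hypothetical. The sketch assumes that 128 Kbps is read as 128*1024 bits per second, which reproduces the 492-byte figure; reading it as 128000 bits per second would give 480 bytes instead.

```python
import math


def early_audio_bytes(audio_bps, gap_seconds):
    """Bytes that must already be in the audio buffer before the gap starts."""
    return math.ceil(audio_bps * gap_seconds / 8)


def max_target_fullness(buffer_bytes, audio_bps, gap_seconds):
    """Highest target fullness the multiplexer may aim for."""
    return buffer_bytes - early_audio_bytes(audio_bps, gap_seconds)


AUDIO_BPS = 128 * 1024  # assumption: 1 Kbps = 1024 bits per second
GAP_SECONDS = 0.030

extra = early_audio_bytes(AUDIO_BPS, GAP_SECONDS)
ac3_target = max_target_fullness(2336, AUDIO_BPS, GAP_SECONDS)
mpeg_target = max_target_fullness(3584, AUDIO_BPS, GAP_SECONDS)

print(f"extra bytes before the gap: {extra}")
print(f"AC-3 target fullness: at most {ac3_target} bytes")
print(f"MPEG-2 audio target fullness: at most {mpeg_target} bytes")
```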

An illustrative embodiment for the transport stream generation device and multi-direction seamless switching is the personalization of TV commercials in a digital television environment. In this application a personalized commercial would consist of several sequential time windows (slots), each having several parallel options. All options for a slot would be transmitted simultaneously within the same transport stream and, at the beginning of each slot, a decision would be made by the receiver which of the options for this slot to show to the viewer. The personalized advertisement would be either inserted into the flow of the main program in a time slot that would be big enough for the total personalized ad (typically 30 seconds), or it would be inserted in an entirely different transport stream, requiring the receiver to temporarily switch to that different transport stream for the duration of the personalized commercial.
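The slot and option structure can be sketched as a small data model. The selection rule used here (first option whose tags match the viewer profile, otherwise the first option of the slot) is purely an assumption for illustration, as are all names; the description above does not specify how the receiver decides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    stream_id: int                # hypothetical identifier of one parallel stream
    tags: frozenset = frozenset() # hypothetical audience tags for this option


def choose_options(slots, viewer_tags):
    """Pick one option per slot, in order, for a given viewer profile."""
    chosen = []
    for options in slots:
        # Assumed rule: first matching option, else the slot's first option.
        match = next((o for o in options if o.tags & viewer_tags), options[0])
        chosen.append(match.stream_id)
    return chosen


# A commercial with two sequential slots, each with parallel options.
commercial = [
    [Option(1), Option(2, frozenset({"sports"}))],
    [Option(1), Option(3, frozenset({"travel"})), Option(4, frozenset({"sports"}))],
]

picks = choose_options(commercial, frozenset({"sports"}))
default_picks = choose_options(commercial, frozenset())
print(picks, default_picks)
```

Each entry of the result corresponds to one switch decision made at the beginning of a slot.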

FIG. 8 shows a transport stream 90 prepared according to an embodiment of the present invention. In this case content selection information has been added by the personalization application. It consists of the indicated SIM (Sequence Identification Message, indicating the start of the message) 100, SOM (Sequence Option Message, indicating that a switch point is coming up, based on which the receiver will decide on the next option to play) 102 and SEM (Sequence End Message) 104, while the switch point trigger message is indicated by the SPM message 106. In this example the personalized ad comprises two segments 101, which have multiple choices of media data, and are preceded by gaps 57 to allow for switching time to an appropriate media data segment. The transport stream 90 shown indicates a personalized ad inserted into a main program.

It will be understood that various modifications may be made to the embodiments disclosed herein. Therefore, the above description should not be construed as limiting, but merely as an exemplification of the various embodiments. Those skilled in the art will envision other modifications within the scope and spirit of the claims appended hereto.

1. A method comprising: receiving a first data stream comprising at least a first segment; receiving a second data stream comprising at least a second segment; and determining a switch gap size, wherein the switch gap size comprises at least a predetermined amount of time needed to switch from transmitting the first segment of the first data stream to transmitting the second segment of the second data stream.